In the present report, I tested the effect of unequal or unbalanced classes on MVPA performance. For the first classification problem, a support vector machine (SVM) model was trained to differentiate between patterns of brain activity when a participant was viewing an animal v. a tool. For more information about the SVM models, please see the About page. Class sizes in the training data were manually adjusted to the following ratios: 2:3, 1:2, and 1:3. These were compared to two balanced conditions: 1:1 and .5:.5. The 1:1 condition contained all observations for both classes. The .5:.5 condition was included to assess the effect of downsampling to balanced class sizes, but with fewer observations.
Classification performance decreased as the class sizes became more unbalanced, though accuracy remained above chance (indicated by the dashed line) in all ratio conditions. Importantly, classification accuracy in the .5:.5 ratio condition was worse than in the balanced 1:1 condition, suggesting that downsampling was not a viable solution when the number of observations per class was so small.
When I broke classification accuracy down by trial timing condition, I saw a similar decrease in performance as the classes became more unbalanced. This decrease was similar across trial timing conditions.
To evaluate bias in the classifier, I next looked at the hit rate and false alarm rate split by the larger v. smaller class. As the class sizes became more unbalanced, the hit and false alarm rates for the larger class increased while the hit and false alarm rates for the smaller class decreased. This suggests that bias in the classifier to predict the larger class increased as the classes became more unbalanced.
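These per-class rates can be computed directly from the true and predicted labels. Below is a minimal Python sketch (the report itself uses R's e1071; the helper name and toy labels here are hypothetical, not the report's actual code) showing how hit and false alarm rates are obtained for one class:

```python
import numpy as np

def hit_false_alarm_rates(y_true, y_pred, target):
    """Hypothetical helper: hit and false alarm rates for one class.

    Hit rate: proportion of `target` trials predicted as `target`.
    False alarm rate: proportion of non-`target` trials predicted as `target`.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    hit = np.mean(y_pred[y_true == target] == target)
    fa = np.mean(y_pred[y_true != target] == target)
    return hit, fa

# Toy labels: a classifier biased toward the larger "animal" class
y_true = ["animal"] * 9 + ["tool"] * 3
y_pred = ["animal"] * 8 + ["tool"] + ["animal", "animal", "tool"]
hit, fa = hit_false_alarm_rates(y_true, y_pred, "animal")  # hit = 8/9, fa = 2/3
```

A biased classifier shows exactly this signature: a high hit rate for the larger class purchased at the cost of a high false alarm rate to that same class.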
The classifier’s bias towards predicting the larger class was consistent across trial timing conditions. All timing conditions showed a similar increase in bias as the classes became more unbalanced.
I then broke down the false alarm rate when each category served as the smaller class. In the balanced conditions, there was no bias to false alarm to one category over the other. This was true even in the .5:.5 ratio condition, even though overall accuracy of the classifier suffered. There was a similar effect in the unbalanced conditions, with false alarm rates comparable between the two categories.
For the second classification problem, a support vector machine (SVM) model was trained to differentiate between patterns of brain activity associated with remembered v. forgotten trials. For more information about the SVM models, please see the About page. As people retain information at different rates, this allowed me to probe the effect of unequal class sizes in a real-world example.
I first looked at how classification performance changed as a function of the number of trials remembered. Note that as the number of remembered trials increases, the classes (remembered v. forgotten) become more unbalanced. Balanced class sizes (number of remembered trials = 6) are marked in green. Since the classes in the testing set were also imbalanced, corrected hit rate (hit rate - false alarm rate) was used to index classification performance. Unexpectedly, as the number of remembered trials increased
I then looked at classification responses when there were fewer forgotten trials (panel 1), fewer remembered trials (panel 2), or an equal number of forgotten v. remembered trials (panel 3). Importantly, even when the classes were balanced, there was a bias in the classifier to predict trials were remembered. This bias persisted when there were more remembered trials than forgotten ones.
Finally, I looked at the change in bias as the number of remembered trials increased. Bias was measured by the rate at which the classifier false alarmed to one class over the other, specifically the false alarm rate for remembered trials minus the false alarm rate for forgotten trials. Visual inspection of the graphs shows that bias increased as the number of remembered trials increased for the quick6 and slow12 trial timing conditions. In other words, as the number of remembered trials increased and the classes became more imbalanced, the classifier was more likely to false alarm and predict remembered trials. The quick4n trial timing condition appeared to be more robust against this bias.
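This bias measure is just the difference between the two false alarm rates. A short Python sketch (function names and toy labels are illustrative assumptions, not the report's actual code):

```python
import numpy as np

def false_alarm_rate(y_true, y_pred, cls):
    # Proportion of trials NOT belonging to `cls` that were labeled `cls`
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(y_pred[y_true != cls] == cls)

def bias(y_true, y_pred):
    # Positive values mean the classifier over-predicts "remembered"
    return (false_alarm_rate(y_true, y_pred, "remembered")
            - false_alarm_rate(y_true, y_pred, "forgotten"))

# Toy labels with 8 remembered and 4 forgotten trials
y_true = ["remembered"] * 8 + ["forgotten"] * 4
y_pred = ["remembered"] * 8 + ["remembered"] * 3 + ["forgotten"]
b = bias(y_true, y_pred)  # 3/4 - 0 = 0.75
```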
In functional MRI multivoxel pattern analysis (MVPA), we are sometimes faced with classification problems where the training set contains unequal class sizes. For example, if we are trying to predict activity for remembered v. forgotten pictures, we may find that there are more remembered than forgotten trials. Unequal observations between the classes may bias the classifier to predict the more frequently observed class. In other words, if trained on more “remembered” trials, the classifier may be more likely to predict trials in the untrained set as “remembered.” While downsampling is a common solution to unequal class sizes, it’s not always an ideal approach when working with fMRI data. The available training data in MVPA is often small and downsampling can exacerbate this problem. This report is intended to showcase the influence of unbalanced or unequal class sizes on MVPA performance and show how downsampling is not always the solution. I will highlight two classification problems, one with manually produced imbalance and another with inherent imbalance.
The data were taken from a previously conducted study that tested the effects of trial timing on fMRI pattern analysis (Zeithamova et al., 2017). While undergoing MRI, participants intentionally encoded pictures of animals and tools. The encoding task was run using five different trial timing conditions with two blocks or runs per condition (10 runs total). The data from three of the trial timing conditions were used in this report: jittered 4-8 s trials (quick4n), 6 s trials (quick6), and 12 s trials (slow12). In each run, participants studied 12 pictures (6 animals + 6 tools). Following the MRI scan, memory for the items was tested in an old v. new recognition task.
For the first classification problem, I trained a support vector machine (SVM) to differentiate between patterns of brain activity when a participant was viewing an animal v. a tool (decoding category). In this problem, the number of observations for animals and tools in the training data was manually adjusted to the following ratios: 2:3, 1:2, and 1:3. Each category (animal v. tool) took a turn serving as the smaller class. Observations for the smaller class were randomly sampled and used to train the SVM model. This sampling process was repeated 5 times. These ratios were compared to two balanced conditions: 1:1 and .5:.5. The 1:1 condition contained all observations for both classes. The .5:.5 condition was included to compare the effect of downsampling to equal class sizes but with half of the available observations.
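The ratio manipulation described above amounts to random subsampling of the smaller class. The Python snippet below is an illustrative analog of that procedure (the helper name, toy trial indices, and random seed are assumptions, not the report's actual code):

```python
import numpy as np

def subsample_to_ratio(idx_small, idx_large, ratio, rng):
    """Subsample the smaller class so class sizes follow `ratio` (small:large)."""
    n_small = int(len(idx_large) * ratio[0] / ratio[1])
    keep = rng.choice(idx_small, size=n_small, replace=False)
    return np.concatenate([keep, idx_large])

rng = np.random.default_rng(0)
animals = np.arange(12)      # indices of animal trials (toy example)
tools = np.arange(12, 24)    # indices of tool trials, serving as the smaller class
for ratio in [(2, 3), (1, 2), (1, 3)]:
    for rep in range(5):     # the random sampling is repeated 5 times per ratio
        train_idx = subsample_to_ratio(tools, animals, ratio, rng)
```

For the 1:3 ratio, for example, 4 randomly chosen tool trials would be kept against all 12 animal trials.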
In the second classification problem, I trained an SVM to differentiate the patterns of brain activity associated with remembered v. forgotten trials. This type of classification problem often produces unequal class sizes as people differ in how well they can remember information.
Brain activity for each trial was first estimated using a general linear model. Estimates of brain activity across voxels (i.e., 3D pixels) were vectorized to serve as the “features” or input to the SVM models. I used a linear SVM from the e1071 package, with the default C = 1. Univariate feature selection was used to select the top 100 voxels. The model was trained using a leave-one-run-out cross-validation approach.
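The report's pipeline uses R's e1071; as a rough Python/scikit-learn analog (the toy data dimensions and random seed are assumptions, not the study's data), the same leave-one-run-out scheme with top-100 univariate feature selection and a linear SVM at C = 1 could be sketched as:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy stand-in for vectorized single-trial activity estimates:
# 120 trials (12 per run x 10 runs) x 500 voxels
X = rng.standard_normal((120, 500))
y = np.tile(np.repeat(["animal", "tool"], 6), 10)  # 6 animals + 6 tools per run
runs = np.repeat(np.arange(10), 12)                # run label for each trial

# Feature selection sits inside the pipeline so the top 100 voxels are
# chosen from the training runs only, never from the held-out run
clf = make_pipeline(SelectKBest(f_classif, k=100), SVC(kernel="linear", C=1.0))
scores = cross_val_score(clf, X, y, groups=runs, cv=LeaveOneGroupOut())
```

Fitting the feature selector within each fold avoids leaking information from the held-out run into voxel selection.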